# README
Because of the size limit on supplementary materials, we provide one example data file for each object detection dataset used in the MGC dataset.

Here is the structure of the example data file:
```
[ 
	{
			#Always empty, as we do not use any region annotations from Visual Genome
			region_info: {}                 
			
			#The object annotations from the original dataset, before the KOSMOS-G transformation
			object_info:{
						object_local_id:  
						{
								#Bounding-box coordinate ratios before the KOSMOS-G transformation
								origin_location: 
								{ 
										XMin:00, 
										XMax:00,
										YMin:00,
										YMax:00,
								},
								
								#The object label id indexed by the corresponding metadata
								label_id: int,
								
								#The attribute label id indexed by metadata
	 							attributes_id:[], 
								                               
						},
			}
			
			
			#The region annotations after KOSMOS-G transformation
			adjusted_object_info:{
					adjusted_local_id:{
							#The KOSMOS-G transformation is centered on this object. Its annotation can be found in object_info, where its object_local_id equals this origin_local_id.
							origin_local_id:00 ,  
							
							#New objects introduced by the transformation. Each key is the origin_local_id of a newly introduced object, and each value is the IOA between the center object and that object.
							merged_local_id_2_ioa:{
									'origin_local_id' : float												
							}, 
							
							#The coordinate ratios of the KOSMOS-G-transformed region
							adjusted_XMin: 00,
							adjusted_XMax: 00,
							adjusted_YMin: 00,
							adjusted_YMax: 00,
							
							#the visual tokens of the region produced by the LaVIT tokenizer
							tokenized:[]
					},
			}
			



		# The image-level annotations
		image_info:{  
							
							#the path where the image is saved
							image_path:"",  
							
							#the original height of the image
							origin_image_height: 000,  
							
							#the original width of the image
							origin_image_width: 000,  
							
							#the source dataset of the image
							source_dataset: "",
							
							#the cocoid or flickrid of the image in Visual Genome
							cocoid: null,      
							flickrid: null,  
							
							#the caption from the original dataset or generated by BLIP 
							caption:[],              
					
							#the visual tokens of the image produced by the LaVIT tokenizer
							tokenized: [],
					
							#relationship annotations from the original dataset
							relationships:{              
									relationship_local_id: {
											#the subject of the relationship (an object_local_id in object_info)
											subject_local_id:00, 
											
											#the object of the relationship (an object_local_id in object_info)
											object_local_id:00, 
											
											#the relationship label id indexed by the metadata
											label_id:000 
									},
							}
					}
	},
	...
]
```
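
The snippet below is a minimal sketch of how a record with this layout could be read. It rests on several assumptions that are not stated above: the example files are plain JSON, the local ids are stored as string keys (as JSON requires), and the coordinate ratios lie in [0, 1] relative to the original image width and height. The helper names (`load_records`, `to_pixels`, `center_objects`, `relationship_triples`) and the `sample` record are hypothetical and only serve as illustration.

```python
import json


def load_records(path):
    """Load the list of records from a data file (assumed to be plain JSON)."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def to_pixels(box, width, height, prefix=""):
    """Convert ratio coordinates to (x_min, y_min, x_max, y_max) in pixels.

    Assumes the ratios are in [0, 1] relative to the original image size.
    """
    return (
        box[prefix + "XMin"] * width,
        box[prefix + "YMin"] * height,
        box[prefix + "XMax"] * width,
        box[prefix + "YMax"] * height,
    )


def center_objects(record):
    """Yield (adjusted_local_id, center object annotation, adjusted box in pixels)."""
    info = record["image_info"]
    width, height = info["origin_image_width"], info["origin_image_height"]
    for adjusted_id, adjusted in record["adjusted_object_info"].items():
        # JSON object keys are strings, so cast the id before the lookup.
        center = record["object_info"][str(adjusted["origin_local_id"])]
        yield adjusted_id, center, to_pixels(adjusted, width, height, prefix="adjusted_")


def relationship_triples(record):
    """Return (subject label_id, relationship label_id, object label_id) triples."""
    objects = record["object_info"]
    triples = []
    for rel in record["image_info"].get("relationships", {}).values():
        subject = objects[str(rel["subject_local_id"])]
        target = objects[str(rel["object_local_id"])]
        triples.append((subject["label_id"], rel["label_id"], target["label_id"]))
    return triples


# Made-up record that follows the layout above; every value is illustrative only.
sample = {
    "region_info": {},
    "object_info": {
        "0": {
            "origin_location": {"XMin": 0.1, "XMax": 0.5, "YMin": 0.2, "YMax": 0.6},
            "label_id": 3,
            "attributes_id": [],
        },
        "1": {
            "origin_location": {"XMin": 0.4, "XMax": 0.9, "YMin": 0.3, "YMax": 0.8},
            "label_id": 7,
            "attributes_id": [],
        },
    },
    "adjusted_object_info": {
        "0": {
            "origin_local_id": 0,
            "merged_local_id_2_ioa": {"1": 0.25},
            "adjusted_XMin": 0.0,
            "adjusted_XMax": 0.5,
            "adjusted_YMin": 0.25,
            "adjusted_YMax": 0.75,
            "tokenized": [],
        },
    },
    "image_info": {
        "image_path": "",
        "origin_image_height": 100,
        "origin_image_width": 200,
        "source_dataset": "",
        "cocoid": None,
        "flickrid": None,
        "caption": [],
        "tokenized": [],
        "relationships": {
            "0": {"subject_local_id": 0, "object_local_id": 1, "label_id": 12},
        },
    },
}

for adjusted_id, center, box in center_objects(sample):
    print(adjusted_id, center["label_id"], box)
print(relationship_triples(sample))
```

To run the same helpers on a real example file, replace `sample` with one element of the list returned by `load_records`. The label ids can then be mapped to names through the corresponding metadata, which is not shown here.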